Multiobjective Genetic Marker Selection

Authors

  • Robert Hubley
  • Eckart
Abstract

A genetic mapping project, typically undertaken during a search for genes responsible for a disease, requires acquiring a set of data from each of a large number of individuals. This data set includes the values of multiple genetic markers, which occur at discrete positions along the genome, a collection of one or more linear chromosomes. Typing the value of a marker in an individual carries a cost, so one seeks to minimize the number of markers typed without excessively jeopardizing the probability of detecting an association between a marker and a disease phenotype. This probability decreases with the distance between the marker and the actual position of the gene responsible for the phenotype; one can therefore maximize the probability of detecting disease linkage by choosing markers as closely spaced as possible. In general, the decrease is not linear with distance: the probability tends to be relatively constant across patches of the genome known as "haplotype blocks". One can thus save considerably on the cost of a mapping project by choosing no more than one marker from each haplotype block. The exact boundaries of haplotype blocks are generally not known prior to project execution, but it is often possible to assume that all haplotype blocks have the same constant length s. One typically searches for a marker-disease linkage within a given locus, or possibly a set of loci, of the genome. For the purposes of this paper, a "locus" is any linear segment of the genome; in practice, a locus typically spans fifty kilobases (kb) to several megabases (Mb). Each locus can be considered independently. The locations of genetic markers are known prior to project initiation. In general, the number of known genetic markers exceeds the number necessary and/or affordable for a project.
Thus, prior to project initiation, one faces the task of selecting a subset of markers from this initial library. This paper presents algorithms for solving this selection task. Markers in the library will have been previously characterized to a greater or lesser extent. A marker may be listed in error, so that in reality no marker exists at that position in the genome. A marker may be present in
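
The idea of choosing no more than one marker per haplotype block of constant length s can be illustrated with a minimal Python sketch. This is not the paper's algorithm. It assumes that blocks are aligned to the start of the locus, that positions are integer base-pair coordinates, and that the marker nearest each block's centre is kept; that tie-breaking rule is an arbitrary choice made for illustration. The function names `select_one_marker_per_block` and `block_coverage` are hypothetical.

```python
from typing import Dict, Iterable, List


def select_one_marker_per_block(
    positions: Iterable[int],
    locus_start: int,
    locus_end: int,
    block_length: int,
) -> List[int]:
    """Keep at most one marker per fixed-length block in [locus_start, locus_end).

    Assumption (not from the paper): blocks are aligned to locus_start, and
    the marker closest to the block centre is kept.
    """
    if block_length <= 0:
        raise ValueError("block_length must be positive")
    chosen: Dict[int, int] = {}
    for pos in sorted(positions):
        if not (locus_start <= pos < locus_end):
            continue  # marker lies outside the locus of interest
        block = (pos - locus_start) // block_length
        centre = locus_start + block * block_length + block_length / 2
        best = chosen.get(block)
        # Replace the current choice only if this marker is strictly closer.
        if best is None or abs(pos - centre) < abs(best - centre):
            chosen[block] = pos
    return [chosen[b] for b in sorted(chosen)]


def block_coverage(
    selected: Iterable[int],
    locus_start: int,
    locus_end: int,
    block_length: int,
) -> float:
    """Fraction of blocks in the locus that contain a selected marker."""
    n_blocks = -(-(locus_end - locus_start) // block_length)  # ceiling division
    covered = {(p - locus_start) // block_length for p in selected}
    return len(covered) / n_blocks
```

Calling `select_one_marker_per_block(library, 0, 50_000, 10_000)` on a list of marker positions returns the reduced marker set, and `block_coverage` on that result gives a rough proxy for how much of the locus remains covered. The number of selected markers and the coverage are the two quantities being traded off against each other.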

Similar resources

A Multi-Objective Genetic Algorithm Approach to Feature Selection in Neural and Fuzzy Modeling

A large number of techniques, such as neural networks and neurofuzzy systems, are used to produce empirical models based in part or in whole on observed data. A key stage in the modelling process is the selection of features. Irrelevant or noisy features increase the complexity of the modelling problem, may introduce additional costs in gathering unneeded data, and frequently degrade modelling p...

Selecting Features in Neurofuzzy Modelling by Multiobjective Genetic Algorithms

Empirical modelling in high dimensional spaces is usually preceded by a feature selection stage. Irrelevant or noisy features unnecessarily increase the complexity of the problem and can degrade modelling performance. Here, multiobjective genetic algorithms are proposed as effective means of evolving a diverse population of alternative feature sets with various accuracy/complexity trade-offs. T...

A Comprehensive Fuzzy Multiobjective Supplier Selection Model under Price Brakes and Using Interval Comparison Matrices

The research on supplier selection is abundant and the works usually only consider the critical success factors in the buyer–supplier relationship. However, the negative aspects of the buyer–supplier relationship must also be considered simultaneously. In this paper we propose a comprehensive model for ranking an arbitrary number of suppliers, selecting a number of them and allocating a quota o...

The Impact of Different Genetic Architectures on Accuracy of Genomic Selection Using Three Bayesian Methods

Genome-wide evaluation uses the associations of a large number of single nucleotide polymorphism (SNP) markers across the whole genome and then combines the statistical methods with genomic data to predict the genetic values. Genomic prediction relies on linkage disequilibrium (LD) between genetic markers and quantitative trait loci (QTL) in a population. Methods that use all markers simultaneo...

Effects of Selection on Genetic Parameters of Secale montanum Based on Seed Storage Protein Marker

Secale montanum is one of the important perennial grasses growing naturally in arid to semiarid pastures and rangelands, with a typical Mediterranean climate, in northern and western Iran at altitudes of 800-2900 m. In this paper, seed storage protein profiles of nine wild populations of S. montanum from different regions of Iran and their phenotypically superior progenies as well as a multi-origin...

Publication date: 2002